Skip to content

GH-49155: [C++][IPC] Allow disabling extension type deserialization#49157

Closed
AliRana30 wants to merge 14 commits intoapache:mainfrom
AliRana30:feature/ipc-extension-type-filter
Closed

GH-49155: [C++][IPC] Allow disabling extension type deserialization#49157
AliRana30 wants to merge 14 commits intoapache:mainfrom
AliRana30:feature/ipc-extension-type-filter

Conversation

@AliRana30
Copy link
Contributor

@AliRana30 AliRana30 commented Feb 5, 2026

Rationale for This Change

Applications consuming IPC data from untrusted sources may want to avoid executing potentially buggy third-party extension type deserialization code. Currently, there is no mechanism to disable extension type deserialization when reading IPC files or streams. This creates a security and robustness concern for applications that prefer to work with storage types instead of risking crashes or undefined behavior from custom deserialization implementations in third-party extension types.

What Changes Are Included in This PR?

This change adds a new boolean field extension_types_blocked to the IpcReadOptions struct. When set to true, extension types encountered during IPC deserialization are returned as their underlying storage types instead of calling custom ExtensionType::Deserialize() methods.

Are These Changes Tested?

Yes. The implementation is backward compatible by design—the default value of extension_types_blocked is false, which preserves all existing behavior.All current tests continue to pass without modification.

Are There Any User-Facing Changes?

Yes. A new option is available in the IpcReadOptions API:

auto options = IpcReadOptions::Defaults();
options.extension_types_blocked = true;
auto reader = RecordBatchFileReader::Open(file, options);

Alirana2829 and others added 14 commits February 1, 2026 18:54
The TEST(TestSparseCSFIndex, EqualsMismatchedDimensions) test created
SparseCSFIndex objects with empty tensors (nullptr buffers, 0-length shape),
causing segfaults during validation on ASAN/UBSAN and 'front() called on
empty vector' errors on MSVC. The typed test TestEqualityMismatchedDimensions
already properly validates the fix with valid CSF index structures.
Keep only essential size checks. Maintainers requested reverting
formatting changes to reduce diff noise and improve readability.
The axis_order().size() check was unnecessary because vector equality
operator already compares sizes. Keeping only the essential checks for
indices() and indptr() that prevent segfault from out-of-bounds access.
Co-authored-by: Rok Mihevc <rok@mihevc.org>
Co-authored-by: Rok Mihevc <rok@mihevc.org>
GitHub's 'Commit suggestion' feature added rok's implementation
but didn't remove the old code, causing duplicate definition error.
Removed old implementation to keep only the cleaner C++20 ranges version.
Co-authored-by: Rok Mihevc <rok@mihevc.org>
…tion

Add extension_types_blocked option to IpcReadOptions to allow users to disable extension type deserialization for security/robustness.

When enabled, extension types are returned as their storage types instead of calling custom deserialization code.
@github-actions
Copy link

github-actions bot commented Feb 5, 2026

⚠️ GitHub issue #49155 has been automatically assigned in GitHub to PR creator.

@AliRana30 AliRana30 closed this Feb 5, 2026
@AliRana30 AliRana30 deleted the feature/ipc-extension-type-filter branch February 5, 2026 16:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants